Abstract

Sphericity tests play a key role in many statistical problems. We propose a Spearman’s rho-type rank test and a Kendall’s tau-type rank test for sphericity in high-dimensional settings. We show that these two tests are asymptotically equivalent. Thanks to the “blessing of dimension”, we do not need to estimate any nuisance parameters. Because the location parameter is not estimated, the dimension is allowed to be arbitrarily large. Asymptotic normality of the two test statistics is also established under elliptical distributions. Simulations demonstrate that the tests are robust and efficient in a wide range of settings.

Key words: Asymptotic normality; Kendall’s tau-type rank test; Large p, small n; Spatial rank; Spatial sign; Spearman’s rho-type test; Sphericity test.

1 Introduction 

Let $X_1,\dots,X_n$ be a random sample of $p$-variate elliptical random vectors with scatter matrix $\Sigma_p$, which describes the covariances between the $p$ variables. We wish to test the following hypothesis:

$$H_0: \Sigma_p = \sigma I_p \quad \text{vs.} \quad H_1: \Sigma_p \neq \sigma I_p. \qquad (1)$$

Such tests play a key role in a number of statistical problems. They arise in several areas of statistical application, such as microarray analysis and geostatistics. When the dimension $p$ is fixed, there is a considerable body of literature on this sphericity testing problem. For multinormal variables, a classical method is the likelihood ratio test (Mauchly 1940). John (1971, 1972) proposed the statistic



where $S$ is the sample covariance matrix. He showed that it is a locally most powerful invariant test for sphericity under the multivariate normal assumption. Muirhead and Waternaux (1980) modified John’s test statistic for a wider class of elliptical distributions.

With the rapid development of technology, various types of high-dimensional data have been generated in many areas, such as hyperspectral imagery, internet portals, microarray analysis and DNA. In genomic studies the data dimension can be much larger than the sample size, the so-called “large p, small n” case. Recently, much effort has been devoted to sphericity tests in high-dimensional settings. Bai et al. (2009) proposed a correction to the likelihood ratio test using random matrix theory when $p/n \to c \in (0,1)$. Ledoit and Wolf (2002) showed that the existing $n$-asymptotic theory remains valid if $p$ goes to infinity with $n$, even for the case $p > n$. Without the normality assumption, Chen, Zhang and Zhong (2010) proposed a high-dimensional test based on $Q_J$ with two accurate estimators for $\mathrm{tr}(\Sigma_p)$ and $\mathrm{tr}(\Sigma_p^2)$. Without explicitly specifying the growth rate of $p$ relative to $n$, they showed that their proposed test statistic is asymptotically normal under the diverging factor model (Bai and Saranadasa 1996). Although the diverging factor model contains a wide range of distributions, it is difficult to justify. Moreover, neither the multivariate $t$-distribution nor a mixture of multivariate normal distributions satisfies this model. This motivates us to construct more robust tests for sphericity.

In the traditional fixed-$p$ setting, multivariate sign- and/or rank-based covariance matrices are often used to construct robust tests for sphericity. See Hallin and Paindaveine (2006) and Oja (2010) for overviews of this topic. However, when the dimension is larger than the sample size, these methods may not work well. Zou et al. (2014) showed that the type I errors of tests based on multivariate signs, such as those of Marden and Gao (2002), Hallin and Paindaveine (2006) and Sirkiä et al. (2009), are much larger than the nominal level because of the estimation of the location parameter. Thus, Zou et al. (2014) proposed a bias-correction procedure for the existing test statistic. However, it only allows the dimension to be at most of the order of the square of the sample size. In practice, the dimension of microarray data may grow at an exponential rate in the sample size. This motivates us to construct new tests for such ultra-high-dimensional cases.

When $p$ is fixed, the Spearman’s rho-type test and the Kendall’s tau-type rank test are two other robust and efficient tests for sphericity (Sirkiä et al. 2009). However, these procedures involve many nuisance parameters, and the estimators proposed in Sirkiä et al. (2009) are unrealistic for high-dimensional data, either because of their computational complexity or because they assume that the location is the origin. Moreover, the natural estimators of $\mathrm{tr}(\Omega_p^2)$ or $\mathrm{tr}(\Xi_p^2)$ based on the sample symmetrized sign or rank covariance matrix would result in a non-negligible bias term when the dimension is ultra-high. In this article, we propose a novel Spearman’s rho-type test and a novel Kendall’s tau-type rank test for sphericity in high-dimensional settings. Thanks to the “blessing of dimension”, those parameters no longer need to be estimated. Based on the leave-out method, there is no bias term in our test statistics. Additionally, because the location parameter is not estimated, the dimension is allowed to be arbitrarily large. Asymptotic normality of the two test statistics is also established under elliptical distributions. Simulations demonstrate that the proposed methods work reasonably well not only for elliptical distributions but also for the diverging factor model.

2 High-dimensional rank tests


2.1 High-dimensional Spearman’s rho-type rank test statistic 

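Suppose $X_1,\dots,X_n$ are generated from a $p$-variate elliptical distribution with density function $\det(\Sigma_p)^{-1/2}g_p\{\|\Sigma_p^{-1/2}(X-\theta_p)\|\}$, where $\|X\| = (X^TX)^{1/2}$ is the Euclidean length of the vector $X$, $\theta_p$ is the symmetry center and $\Sigma_p$ is a positive definite symmetric $p\times p$ scatter matrix. Similar to Zou et al. (2014), define $\Sigma_p = \sigma_p\Lambda_p$, where $\mathrm{tr}(\Lambda_p) = p$ and $\sigma_p$ is a scale parameter. Testing hypothesis (1) is equivalent to testing

$$H_0: \Lambda_p = I_p \quad \text{vs.} \quad H_1: \Lambda_p \neq I_p.$$

The spatial-rank function is defined as $R(X) = E\{U(X-\tilde X)\mid X\}$, where $\tilde X$ is an independent copy of $X$ and $U(X) = \|X\|^{-1}X\,I(X\neq 0)$. The spatial-rank covariance matrix is $\Omega_p = E\{R(X)R(X)^T\}$. Under the null hypothesis, $\Omega_p = \tau_Fp^{-1}I_p$, where $\tau_F$ is a constant depending on $g_p$. Similar to John’s test, a natural distance measure between $\Omega_p$ and $\tau_Fp^{-1}I_p$ is

$$p\,\mathrm{tr}\Big[\Big\{\frac{\Omega_p}{\mathrm{tr}(\Omega_p)} - p^{-1}I_p\Big\}^2\Big] = \frac{p\,\mathrm{tr}(\Omega_p^2)}{\mathrm{tr}^2(\Omega_p)} - 1.$$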
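The distance measure above has the closed form $p\,\mathrm{tr}(\Omega_p^2)/\mathrm{tr}^2(\Omega_p) - 1$. As a worked check, the expansion below uses a generic symmetric matrix $A$ with $\mathrm{tr}(A)\neq 0$; the symbol $A$ is introduced here only for illustration and stands in for $\Omega_p$.

```latex
\begin{align*}
p\,\mathrm{tr}\Big[\Big(\frac{A}{\mathrm{tr}(A)}-\frac{1}{p}I_p\Big)^2\Big]
 &= p\Big\{\frac{\mathrm{tr}(A^2)}{\mathrm{tr}^2(A)}
      -\frac{2}{p}\,\frac{\mathrm{tr}(A)}{\mathrm{tr}(A)}
      +\frac{\mathrm{tr}(I_p)}{p^2}\Big\}\\
 &= \frac{p\,\mathrm{tr}(A^2)}{\mathrm{tr}^2(A)}-2+1
  = \frac{p\,\mathrm{tr}(A^2)}{\mathrm{tr}^2(A)}-1 .
\end{align*}
```

By the Cauchy-Schwarz inequality applied to the eigenvalues of $A$, $\mathrm{tr}^2(A)\le p\,\mathrm{tr}(A^2)$, so the measure is nonnegative and equals zero exactly when $A$ is proportional to $I_p$.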


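In the fixed-$p$ case, we adopt the sample spatial-rank covariance matrix $\Omega_{n,p}$ to estimate $\Omega_p$, i.e.

$$\Omega_{n,p} = \frac{1}{n}\sum_{i=1}^n R_iR_i^T,$$

where $R_i = n^{-1}\sum_{j\neq i}U_{ij}$ and $U_{ij} = U(X_i - X_j)$. Then, the classical Spearman’s rho-type rank test statistic is defined as

$$Q_S = p\,\mathrm{tr}\Big[\Big\{\frac{\Omega_{n,p}}{\mathrm{tr}(\Omega_{n,p})} - p^{-1}I_p\Big\}^2\Big] = \frac{p\,\mathrm{tr}(\Omega_{n,p}^2)}{\mathrm{tr}^2(\Omega_{n,p})} - 1.$$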
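The classical fixed-$p$ statistic, in the form $p\,\mathrm{tr}(\Omega_{n,p}^2)/\mathrm{tr}^2(\Omega_{n,p}) - 1$, can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation; the function names are hypothetical, and the traces are computed from the Gram matrix of the spatial ranks so that the $p\times p$ matrix is never formed.

```python
import math

def spatial_sign(x):
    """Spatial sign U(x) = x / ||x||, with U(0) = 0."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x] if norm > 0 else [0.0] * len(x)

def spatial_ranks(X):
    """R_i = n^{-1} * sum_{j != i} U(X_i - X_j) for each row X_i of X."""
    n, p = len(X), len(X[0])
    ranks = []
    for i in range(n):
        r = [0.0] * p
        for j in range(n):
            if j != i:
                u = spatial_sign([a - b for a, b in zip(X[i], X[j])])
                r = [rv + uv for rv, uv in zip(r, u)]
        ranks.append([rv / n for rv in r])
    return ranks

def classical_rank_statistic(X):
    """p * tr(Omega^2) / tr(Omega)^2 - 1 with Omega = n^{-1} sum_i R_i R_i^T."""
    n, p = len(X), len(X[0])
    R = spatial_ranks(X)
    # tr(Omega) = n^{-1} sum_i ||R_i||^2 and
    # tr(Omega^2) = n^{-2} sum_{i,k} (R_i^T R_k)^2.
    gram = [[sum(a * b for a, b in zip(R[i], R[k])) for k in range(n)]
            for i in range(n)]
    tr1 = sum(gram[i][i] for i in range(n)) / n
    tr2 = sum(g * g for row in gram for g in row) / n ** 2
    return p * tr2 / tr1 ** 2 - 1.0
```

Because only differences $X_i - X_j$ and their directions enter, the value is unchanged by shifting, rescaling or rotating the data, which is easy to confirm numerically on a small simulated sample.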


It can be shown that when $p$ is fixed, under the null hypothesis, a suitably normalized version of $Q_S$ converges in distribution to a $\chi^2_{(p+2)(p-1)/2}$ distribution, where the normalization involves $n$ and two nuisance parameters $\gamma_S$ and $\tau_p$ that depend on $g_p$ and $p$. Sirkiä et al. (2009) suggest estimating $\tau_p$ by $\mathrm{tr}(\Omega_{n,p})/p$, and they suggest two estimators for $\gamma_S$. One is obtained from the defining formula of $\gamma_S$. However, it assumes that the location of $X_i$ is the origin, which is unrealistic in practice. Additionally, if we standardize the samples by the estimated location parameters, as shown in Zou et al. (2014), there would be another non-negligible bias term in the classical statistic $Q_S$ when $p/n^2$ is large enough. The other estimator of $\gamma_S$ is a complex symmetric U-statistic whose computational cost is polynomial in both $n$ and $p$. The total cost of computing the classical $Q_S$ is larger still, because the covariance matrix of $\mathrm{vec}(\Omega_{n,p})$ must be inverted. This is too expensive for high-dimensional data.

Fortunately, according to Lemma 1 in the appendix, $\Omega_p = 0.5p^{-1}I_p(1+o(1))$ under the null hypothesis as $p\to\infty$. Thus, $\mathrm{tr}(\Omega_p)\to 0.5$, and we only need a better estimator for $\mathrm{tr}(\Omega_p^2)$. However, the natural estimator $\mathrm{tr}(\Omega_{n,p}^2)$ would result in a non-negligible bias term in the test statistic when $p$ is ultra-high. Based on the leave-out method, we define the following new estimator for $\mathrm{tr}(\Omega_p^2)$,


tr(ri2) = 


2n{n — l)(n — 2)(n — 3) 




^UkiUlpu 


are not equal 


Then, we define the following high-dimensional Spearman’s rho-type rank test statistic (abbreviated as SR hereafter):

$$Q_S = 4p\,\widehat{\mathrm{tr}(\Omega_p^2)} - 1.$$

Obviously, the value of $Q_S$ remains unchanged under $Z_i = aOX_i + c$, where $a$ is a constant, $O$ is an orthogonal matrix and $c$ is a vector of constants. Thus, the test statistic $Q_S$ is invariant under location shifts, scale changes and rotations. The following theorem establishes the asymptotic null distribution of $Q_S$.

Theorem 1 Under $H_0$, as $n\to\infty$ and $p\to\infty$, $Q_S/\sigma_0 \to N(0,1)$ in distribution, where $\sigma_0^2 = 4(p-1)/\{n(n-1)(p+2)\}$.
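Theorem 1 turns an observed statistic into a test with no estimated nuisance quantities. The sketch below (hypothetical function names, standard library only) computes $\sigma_0^2$ and the resulting one-sided normal test; the upper-tail rejection rule $Q_S/\sigma_0 > z_\alpha$ follows Corollary 1, and the observed value `q` is assumed to have been computed elsewhere.

```python
import math
from statistics import NormalDist

def sigma0_squared(n, p):
    """Null variance sigma_0^2 = 4(p-1) / {n(n-1)(p+2)} from Theorem 1."""
    return 4.0 * (p - 1) / (n * (n - 1) * (p + 2))

def sphericity_z_test(q, n, p, alpha=0.05):
    """Return (z, p_value, reject) for an observed statistic q (upper-tail test)."""
    z = q / math.sqrt(sigma0_squared(n, p))
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha
```

Note that $\sigma_0^2$ depends only on $n$ and $p$ and is close to $4/n^2$ once $p$ is large, which is the sense in which the dimension stops mattering.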

According to Theorem 1, there are no nuisance parameters in the newly proposed test procedure. As $n,p$ go to infinity, $Q_S$ is asymptotically normal and its variance depends only on $p$ and $n$. This can be viewed as a “blessing of dimension” phenomenon. Moreover, the computational cost of the entire procedure is polynomial in $n$ and only linear in $p$, which is clearly less than that of the classical Spearman’s rho-type rank test procedure.

Theorem 1 also shows that there is no bias term in $Q_S$. So, we do not need a bias-correction procedure as in Zou et al. (2014). Moreover, we do not require any relationship between the sample size $n$ and the dimension $p$. In contrast, the test proposed by Zou et al. (2014) (abbreviated as SS hereafter) requires the dimension to be at most of the order of the square of the sample size. When $p/n^2\to\infty$, there would be another bias term in the SS test statistic, which is difficult to calculate. Simulation studies also demonstrate these results; see Section 3 for more information.

Next, we consider the asymptotic distribution of $Q_S$ under the alternative $H_1: \Lambda_p = I_p + D_{n,p}$. Define

$$\sigma_1^2 = \sigma_0^2 + n^{-2}p^{-2}\{8p\,\mathrm{tr}(D_{n,p}^2) + 4\,\mathrm{tr}^2(D_{n,p}^2)\} + 8n^{-1}p^{-2}\{\mathrm{tr}(\Lambda_p^4) - p^{-1}\mathrm{tr}^2(\Lambda_p^2)\}.$$

Theorem 2 Suppose that $n\,\mathrm{tr}(D_{n,p}^2)/p = O(1)$. Under $H_1$, $\{Q_S - \mathrm{tr}(D_{n,p}^2)/p\}/\sigma_1 \to N(0,1)$ in distribution, as $p\to\infty$, $n\to\infty$.
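Theorem 2 implies a normal approximation to the power of the test $Q_S/\sigma_0 > z_\alpha$, namely $\Phi\{(\mathrm{tr}(D_{n,p}^2)/p - z_\alpha\sigma_0)/\sigma_1\}$. The following sketch evaluates it for a diagonal $\Lambda_p$ given by its eigenvalues. The function names are hypothetical, the input is rescaled so that $\mathrm{tr}(\Lambda_p)=p$, and the approximation should only be trusted in the local regime $n\,\mathrm{tr}(D_{n,p}^2)/p = O(1)$ assumed by the theorem.

```python
import math
from statistics import NormalDist

def sigma0_squared(n, p):
    """sigma_0^2 = 4(p-1) / {n(n-1)(p+2)}."""
    return 4.0 * (p - 1) / (n * (n - 1) * (p + 2))

def sigma1_squared(n, eigenvalues):
    """Return (sigma_1^2, tr(D^2)/p) for Lambda_p with the given eigenvalues."""
    p = len(eigenvalues)
    scale = p / sum(eigenvalues)
    lam = [scale * v for v in eigenvalues]      # enforce tr(Lambda_p) = p
    tr_d2 = sum((v - 1.0) ** 2 for v in lam)    # tr(D^2), D = Lambda_p - I_p
    tr_l2 = sum(v ** 2 for v in lam)
    tr_l4 = sum(v ** 4 for v in lam)
    s1 = (sigma0_squared(n, p)
          + (8.0 * p * tr_d2 + 4.0 * tr_d2 ** 2) / (n ** 2 * p ** 2)
          + 8.0 * (tr_l4 - tr_l2 ** 2 / p) / (n * p ** 2))
    return s1, tr_d2 / p

def approximate_power(n, eigenvalues, alpha=0.05):
    """Normal approximation to P(Q_S / sigma_0 > z_alpha) implied by Theorem 2."""
    p = len(eigenvalues)
    s0 = math.sqrt(sigma0_squared(n, p))
    s1_sq, shift = sigma1_squared(n, eigenvalues)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return NormalDist().cdf((shift - z_alpha * s0) / math.sqrt(s1_sq))
```

For example, `approximate_power(20, [2.0] * 15 + [1.0] * 85)` corresponds to a scatter matrix whose first 15 of 100 eigenvalues are doubled; with all eigenvalues equal the function returns the nominal level.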

According to Theorem 2, if $p = O(n^2)$, $Q_S$ has the same power function as the test proposed by Zou et al. (2014). However, when $p/n^2\to\infty$, the variance of the SS test statistic will be larger than $\sigma_1^2$ because of the estimation of the location parameter $\theta_p$. See Section 3 for more discussion.

In addition, we can establish the consistency of our high-dimensional Spearman’s rho-type rank test based on Theorem 2.

Corollary 1 If $n\,\mathrm{tr}(D_{n,p}^2)/p\to\infty$, the test that rejects $H_0$ when $Q_S/\sigma_0 > z_\alpha$ is consistent against $H_1$ as $n\to\infty$ and $p\to\infty$.

Theorems 1 and 2 also allow us to compare our SR test with existing work, such as Chen et al. (2010). The following corollary concerns the limiting efficiency of the SR test relative to the test of Chen et al. (2010) (abbreviated as CZZ hereafter) under the multivariate normality assumption.

Corollary 2 If $C_1 < n\,\mathrm{tr}(D_{n,p}^2)/p < C_2$ for some constants $C_1$ and $C_2$, then under multinormal distributions the SR test is asymptotically as efficient as the CZZ test.

It is worth pointing out that a theoretical comparison of the proposed test with the CZZ test under general multivariate distributions turns out to be difficult. This is because the asymptotic validity of the CZZ test relies on the diverging factor model, while the elliptical assumption is required in Theorems 1 and 2. The distinction and connection between elliptical distributions and the diverging factor model are far from clear in the literature.


2.2 High-dimensional Kendall’s tau-type rank test statistic 


In this subsection, we consider another efficient sphericity test, the Kendall’s tau-type rank test. The classical Kendall’s tau covariance matrix is defined as

$$\Xi_{n,p} = \frac{1}{n(n-1)}\sum_{i\neq j}U_{ij}U_{ij}^T.$$

Under $H_0$, we have $E(\Xi_{n,p}) = \Xi_p = p^{-1}I_p$. Thus, the classical Kendall’s tau test statistic is defined as

$$Q_K = p\,\mathrm{tr}\big[\{\mathrm{tr}^{-1}(\Xi_{n,p})\Xi_{n,p} - p^{-1}I_p\}^2\big] = p\,\mathrm{tr}(\Xi_{n,p}^2) - 1,$$

where the second equality uses $\mathrm{tr}(\Xi_{n,p}) = 1$.

It can be shown that when $p$ is fixed, under the null hypothesis, a suitably normalized version of $Q_K$ converges in distribution to a $\chi^2_{(p+2)(p-1)/2}$ distribution, where the normalization involves another nuisance parameter $\gamma_K$ that depends on $g_p$ and $p$. Similarly, the estimator for $\gamma_K$ in Sirkiä et al. (2009) cannot be used in high-dimensional settings, since it requires either that the location be the origin or a computation whose cost is polynomial in both $n$ and $p$. Thanks to the “blessing of dimension”, we do not need this nuisance parameter for high-dimensional data either. Moreover, the natural estimator $\mathrm{tr}(\Xi_{n,p}^2)$ would also result in a non-negligible bias term in $Q_K$ when $p$ is ultra-high. Thus, based on the leave-out method, we propose the following estimator for $\mathrm{tr}(\Xi_p^2)$:

$$\widehat{\mathrm{tr}(\Xi_p^2)} = \frac{1}{n(n-1)(n-2)(n-3)}\sum_{i,j,k,l\ \text{are not equal}}(U_{ij}^TU_{kl})^2.$$

Then, we define the following high-dimensional Kendall’s tau-type rank test statistic (abbreviated as SK hereafter):

$$Q_K = p\,\widehat{\mathrm{tr}(\Xi_p^2)} - 1.$$

Obviously, the test statistic $Q_K$ is also invariant under rotations. We can also establish the asymptotic properties of $Q_K$ as follows.

Theorem 3 As $n\to\infty$ and $p\to\infty$,

(i) Under $H_0$, $Q_K/\sigma_0 \to N(0,1)$ in distribution.

(ii) Under $H_1$, if $n\,\mathrm{tr}(D_{n,p}^2)/p = O(1)$, $\{Q_K - \mathrm{tr}(D_{n,p}^2)/p\}/\sigma_1 \to N(0,1)$ in distribution.

In fact, as shown in the proof of Theorem 3, $Q_K$ is asymptotically equivalent to $Q_S$ under both the null and the alternative hypotheses. In high-dimensional settings, the Kendall’s tau-type rank test is therefore equivalent to the Spearman’s rho-type rank test. Thus, similar to Corollary 1, we can also show the consistency of the SK test. The SK test is also asymptotically as efficient as the CZZ test under multinormal distributions, by arguments similar to those for Corollary 2. We state these results in the following corollary.

Corollary 3 As $n\to\infty$ and $p\to\infty$, we have

(i) if $n\,\mathrm{tr}(D_{n,p}^2)/p\to\infty$, the test that rejects $H_0$ when $Q_K/\sigma_0 > z_\alpha$ is consistent against $H_1$;

(ii) if $C_1 < n\,\mathrm{tr}(D_{n,p}^2)/p < C_2$, under multinormal distributions, the SK test is asymptotically as efficient as the CZZ test.
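The SK statistic of this subsection can be sketched directly from its definition as an average of $(U_{ij}^TU_{kl})^2$ over all ordered quadruples of distinct indices. The code below is a brute-force illustration in standard-library Python with hypothetical function names; its cost grows like $n^4p$, so it is meant for small $n$ only and is not claimed to be how the authors computed their results.

```python
import math
from itertools import permutations

def spatial_sign(x):
    """Spatial sign U(x) = x / ||x||, with U(0) = 0."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x] if norm > 0 else [0.0] * len(x)

def kendall_type_statistic(X):
    """Q_K = p * tr_hat - 1, tr_hat = average of (U_ij^T U_kl)^2 over distinct i, j, k, l."""
    n, p = len(X), len(X[0])
    if n < 4:
        raise ValueError("at least four observations are needed")
    # Pairwise spatial signs U_ij = U(X_i - X_j), stored once.
    U = {(i, j): spatial_sign([a - b for a, b in zip(X[i], X[j])])
         for i in range(n) for j in range(n) if i != j}
    total = 0.0
    for i, j, k, l in permutations(range(n), 4):   # ordered distinct quadruples
        dot = sum(a * b for a, b in zip(U[i, j], U[k, l]))
        total += dot * dot
    tr_hat = total / (n * (n - 1) * (n - 2) * (n - 3))
    return p * tr_hat - 1.0

def sk_z_score(X):
    """Standardized statistic Q_K / sigma_0, compared with z_alpha in the SK test."""
    n, p = len(X), len(X[0])
    sigma0 = math.sqrt(4.0 * (p - 1) / (n * (n - 1) * (p + 2)))
    return kendall_type_statistic(X) / sigma0
```

Since each squared inner product lies in $[0,1]$, the statistic always lies in $[-1, p-1]$; the upper end is attained when all observations fall on a single line, the most non-spherical configuration.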

3 Simulation 

We consider the following five distributions for comparison:

(I) The standard multivariate normal;

(II) The standard multivariate $t$ with four degrees of freedom, $t_{p,4}$;

(III) Mixtures of two multivariate normal densities $\kappa f_p(\mu, I_p) + (1-\kappa)f_p(\mu, 9I_p)$, where $f_p(\cdot;\cdot)$ is the $p$-variate multivariate normal density. The value $\kappa$ is chosen to be 0.8;

(IV) The diverging factor model with the standardized Gamma(4, 0.5) distribution;

(V) The diverging factor model with the standardized $t$ distribution with four degrees of freedom, $t_4$.

Here we choose $\Gamma = I_p$ and, for each $Z_i$, $p$ independent and identically distributed random variables $Z_{ij}$ are generated in the diverging factor model in Scenarios (IV) and (V). The first three scenarios are well-known multivariate elliptical distributions, whereas the last two are not elliptically distributed. We consider sample sizes $n = 20, 30$ and dimensions $p = 100, 200, 400, 800$. Similar to Chen et al. (2010), we obtain the observations $X_i = AY_i$, where the $Y_i$ are generated from Scenarios (I)-(V) and $A = \mathrm{diag}\{2^{1/2}1_{[vp]}, 1_{p-[vp]}\}$, with $[x]$ denoting the integer truncation of $x$. Three levels of $v$ were considered: 0 (size), 0.15 and 0.3. We compare our high-dimensional Spearman’s rho-type rank test (abbreviated as SR) and high-dimensional Kendall’s tau test (abbreviated as SK) with the bias-corrected sign test proposed by Zou et al. (2014) (abbreviated as SS) and the sphericity test proposed by Chen et al. (2010) (abbreviated as CZZ). Tables 1 and 2 report the empirical sizes and power of these four tests under Scenarios (I)-(III) and (IV)-(V), respectively.
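The data-generating step for the elliptical Scenarios (I)-(III) can be sketched as follows. This is an illustrative reconstruction, not the authors' simulation code: the function names are hypothetical, the location $\mu$ is set to zero (the tests are location invariant), and the multivariate $t$ vector is generated as a normal vector divided by the square root of an independent $\chi^2_4/4$ variable.

```python
import math
import random

def draw_y(p, scenario, rng):
    """One p-vector from Scenario 'I', 'II' or 'III' (location set to zero)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    if scenario == "I":                         # standard multivariate normal
        return z
    if scenario == "II":                        # multivariate t, 4 degrees of freedom
        w = rng.gammavariate(2.0, 2.0) / 4.0    # chi-square(4) / 4
        return [v / math.sqrt(w) for v in z]
    if scenario == "III":                       # 0.8 N(0, I_p) + 0.2 N(0, 9 I_p)
        s = 1.0 if rng.random() < 0.8 else 3.0
        return [s * v for v in z]
    raise ValueError("unknown scenario")

def generate_sample(n, p, v, scenario, seed=0):
    """X_i = A Y_i with A = diag(sqrt(2) 1_[vp], 1_{p-[vp]})."""
    rng = random.Random(seed)
    k = int(v * p + 1e-9)                       # integer truncation [vp], guarded against rounding
    a = [math.sqrt(2.0)] * k + [1.0] * (p - k)
    return [[ai * yi for ai, yi in zip(a, draw_y(p, scenario, rng))]
            for _ in range(n)]
```

With $v = 0$ the matrix $A$ is the identity, so the sample is spherical and rejection rates estimate the size; with $v = 0.15$ or $0.3$ the first $[vp]$ coordinates have their variance doubled, and rejection rates estimate the power.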

Firstly, we consider the empirical sizes of these tests. The empirical sizes of the SR and SK tests are close to the nominal level in all cases and are not affected by the dimension. However, SS cannot control its empirical size well in many cases: sometimes it is a little conservative, and sometimes its size is much larger than the nominal level. To evaluate the impact of the dimension on the bias term of SS, we also report the mean-standard deviation-ratio $E(T)/\sqrt{\mathrm{var}(T)}$ and the variance estimator ratio $\mathrm{var}(T)/\widehat{\mathrm{var}}(T)$ of these four tests. Since the explicit forms of $E(T)$ and $\mathrm{var}(T)$ are difficult to calculate for all tests, we estimate them by simulation. Figures 1 and 2 report the mean-standard deviation-ratio of these four tests. Figures 3 and 4 report the variance estimator ratio of these tests. We observe that a bias term in SS clearly exists, especially when $p/n^2$ is large. This is not surprising, because SS only allows the dimension to be comparable to the square of the sample size. In contrast, the mean-standard deviation-ratio of our SR and SK test statistics is approximately zero, which shows that, regardless of the dimension, there is no bias term in our test statistics. Under Scenarios (III)-(V), the variance estimator ratio of SS is clearly larger than one when $p/n^2$ is large. When the dimension gets larger, the bias of the spatial-median estimator also increases the variance of the SS test statistic, so the empirical size of SS is difficult to maintain in these cases. However, the variance estimator ratio of our SR and SK test statistics is approximately one. Because the location parameter is not estimated, the variance of the SR and SK test statistics does not increase with the dimension. In addition, when the samples are generated from the diverging factor model, the empirical sizes of the CZZ test are a little larger than the nominal level in most cases. However, under Scenarios (II) and (III), the mean-standard deviation-ratio of CZZ is smaller than zero and the variance estimator ratio is clearly larger than one, and the empirical sizes of the CZZ test are significantly larger than the nominal level. This is not surprising, because neither the multivariate $t$-distribution nor a mixture of multivariate normal distributions belongs to the diverging factor model.

Next, we consider the power comparison of these tests. The SR and SK tests perform similarly to each other, which is consistent with the theoretical results in Section 2. In general, both the SR and SK tests perform a little better than the SS test in most cases. The variance of the SS test statistic increases faster than that of the SR and SK test statistics because of the estimation of the location parameter, so it is not surprising that the power of SS is smaller than that of these two tests. Admittedly, the power of SS is larger than that of SR and SK in some cases, such as Scenario (II) with $(n,p) = (20, 800)$. However, the empirical sizes of SS are also larger than the nominal level in these cases, so its high power is not very meaningful. In addition, our SR and SK tests perform similarly to the CZZ test under normal distributions. Even under the non-elliptical distributions (Scenarios (IV) and (V)), the difference between CZZ and SR and SK is marginal. However, under the two heavy-tailed elliptical distributions (Scenarios (II) and (III)), our SR and SK tests perform clearly better than the CZZ test.

All these results suggest that the two proposed tests are quite robust and efficient in testing sphericity. Because the location parameter is not estimated, the SR and SK tests control their empirical sizes very well and are more powerful than the SS test under the alternative hypothesis. For heavy-tailed or skewed distributions, the SR and SK tests perform much better than the CZZ test in both size and power.

Figure 1: The mean-standard deviation-ratio of test statistics under Scenarios (I)-(III).


4 Discussion 

Multivariate-rank based methods are robust and efficient tools for constructing test procedures in multivariate problems. In this paper, we proposed two novel test statistics for the sphericity test based on multivariate ranks. We believe that this procedure can be extended to more general elliptical distributions with $\Sigma_p = \mathrm{diag}\{\sigma_{11},\dots,\sigma_{pp}\}$, where the $\sigma_{ii}$ are unknown. Moreover, the high-dimensional location testing problem has also drawn much attention in statistics (Chen and Qin 2010). Wang et al. (2015) proposed a high-dimensional test for the one-sample location problem based on multivariate signs. However, tests for the location problem based on multivariate ranks deserve future study in high-dimensional settings.


5 Appendix 

Appendix A: Some useful Lemmas 

Figure 2: The mean-standard deviation-ratio of test statistics under Scenarios (IV)-(V).

Figure 3: The variance-ratio of test statistics under Scenarios (I)-(III).

Table 1: Empirical size and power comparison at 5% significance under Scenarios (I)-(III)

                    Size                     v = 0.15                  v = 0.30
(n,p)         SR    SK    SS   CZZ      SR    SK    SS   CZZ      SR    SK    SS   CZZ
Scenario (I)
(20,100)     5.8   5.8   3.9   5.8      24    24    16    26      33    33    25    34
(20,200)     6.3   6.3   5.3   6.5      28    28    23    29      36    36    22    36
(20,400)     6.3   6.3   4.5   7.6      26    26    14    27      34    33    20    35
(20,800)     6.0   6.0   6.0   7.6      25    25    21    26      36    36    21    37
(30,100)     5.6   5.7   5.2   6.1      39    39    34    41      52    52    48    55
(30,200)     4.9   4.9   3.6   5.5      42    42    34    43      56    56    51    56
(30,400)     5.1   5.1   3.0   5.1      40    40    22    41      56    56    43    57
(30,800)     6.5   6.5   4.2   6.8      41    41    30    42      55    55    47    56
Scenario (II)
(20,100)     5.0   5.3   5.8   9.7      24    26    23    21      30    32    32    25
(20,200)     4.9   5.8   6.8  10.1      26    28    28    22      32    35    35    27
(20,400)     5.9   6.7   9.0  11.5      25    27    28    22      32    34    34    27
(20,800)     5.0   5.7  11.7  10.1      24    26    33    22      34    37    45    28
(30,100)     5.7   4.9   5.3  11.6      37    40    38    28      48    51    50    34
(30,200)     6.0   5.6   5.5  11.0      40    43    41    30      52    56    55    39
(30,400)     5.2   5.2   6.4  10.8      38    41    41    30      52    55    57    37
(30,800)     6.5   6.0   7.9  12.0      38    41    42    31      50    53    57    38
Scenario (III)
(20,100)     6.2   6.2   4.8  11.4      21    23    21    19      29    31    28    23
(20,200)     5.9   5.8   6.7  12.2      25    27    26    22      32    35    30    25
(20,400)     5.8   6.3   5.0  12.7      25    27    23    21      34    35    28    24
(20,800)     5.2   5.9   9.2  11.9      24    27    29    21      34    37    29    26
(30,100)     4.6   6.3   5.3  14.9      36    41    38    31      48    54    50    37
(30,200)     4.8   4.5   4.6  13.7      38    42    41    29      50    54    54    35
(30,400)     5.7   5.5   3.6  16.8      37    41    36    31      52    57    54    37
(30,800)     5.8   5.0   5.9  13.4      37    41    40    28      51    55    55    35

Figure 4: The variance-ratio of tests under Scenarios (IV)-(V).

Denote $\varepsilon_i = \Sigma_p^{-1/2}(X_i - \theta_p)$ and $u_i = E\{U(\varepsilon_i - \varepsilon_j)\mid\varepsilon_i\}$. Obviously, $E(u_iu_i^T) = \tau_pp^{-1}I_p$, where $\tau_p$ is a constant depending on the distribution $g_p$ and $p$.

Lemma 1 $\tau_p \to 0.5$ as $p\to\infty$.


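Proof.

$$\begin{aligned}
E(\varepsilon_i^T\varepsilon_i) &= E\{(\varepsilon_i-\varepsilon_j)^T(\varepsilon_i-\varepsilon_k)\}\\
&= E[E\{(\varepsilon_i-\varepsilon_j)^T(\varepsilon_i-\varepsilon_k)\mid\varepsilon_i\}]\\
&= E[E\{\|\varepsilon_i-\varepsilon_j\|\,\|\varepsilon_i-\varepsilon_k\|\,U(\varepsilon_i-\varepsilon_j)^TU(\varepsilon_i-\varepsilon_k)\mid\varepsilon_i\}]\\
&= E[\{E(\|\varepsilon_i-\varepsilon_j\|\mid\varepsilon_i)\}^2]\,E[E\{U(\varepsilon_i-\varepsilon_j)^TU(\varepsilon_i-\varepsilon_k)\mid\varepsilon_i\}].
\end{aligned}$$

The last factor equals $E(u_i^Tu_i) = \tau_p$. In addition, $E(\|\varepsilon_i\|^2) = 0.5E(\|\varepsilon_i-\varepsilon_j\|^2)$. Thus, we only need to show that

$$\frac{E[\{E(\|\varepsilon_i-\varepsilon_j\|\mid\varepsilon_i)\}^2]}{E(\|\varepsilon_i-\varepsilon_j\|^2)} \to 1.$$

Because $\varepsilon_i$ has an elliptical distribution, $\varepsilon_i-\varepsilon_j$ also has an elliptical distribution. Write the density function of $\|\varepsilon_i-\varepsilon_j\|$ as $f(t) = c_pt^{p-1}g(t)$, where $c_p$ is a normalizing constant depending on $p$. Thus,

$$\frac{E[\{E(\|\varepsilon_i-\varepsilon_j\|\mid\varepsilon_i)\}^2]}{E(\|\varepsilon_i-\varepsilon_j\|^2)} \ge \frac{\{\int c_pt^{p}g(t)\,dt\}^2}{\int c_pt^{p+1}g(t)\,dt} = \frac{c_{p+1}^2}{c_pc_{p+2}} = \frac{\Gamma^2((p+1)/2)}{\Gamma(p/2)\,\Gamma((p+2)/2)}.$$

By Stirling’s formula,

$$\lim_{x\to\infty}\frac{\Gamma(x+1)}{(x/e)^x(2\pi x)^{1/2}} = 1,$$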
as $p\to\infty$, we have

$$\frac{c_{p+1}^2}{c_pc_{p+2}} \sim \frac{(p-1)^{p-1}}{p^{p/2}(p-2)^{(p-2)/2}} = (1-p^{-1})^{p/2}\{1+(p-2)^{-1}\}^{(p-2)/2} \to 1.$$

Here we complete the proof. □

Table 2: Empirical size and power comparison at 5% significance under Scenarios (IV)-(V)

                    Size                     v = 0.15                  v = 0.30
(n,p)         SR    SK    SS   CZZ      SR    SK    SS   CZZ      SR    SK    SS   CZZ
Scenario (IV)
(20,100)     4.8   5.9   4.9   7.1      24    24    18    25      31    31    25    32
(20,200)     5.0   5.0   5.8   7.8      27    27    23    28      34    34    25    35
(20,400)     4.5   4.5   3.4   7.0      26    26    15    27      33    33    20    34
(20,800)     5.0   5.0   6.6   7.4      25    25    22    26      35    35    19    36
(30,100)     4.8   4.8   4.6   6.0      38    38    35    42      51    51    49    53
(30,200)     5.6   5.8   4.7   6.1      40    40    36    42      55    55    52    56
(30,400)     5.3   5.3   4.2   5.7      41    41    29    40      55    55    41    56
(30,800)     5.9   4.9   3.8   7.1      42    42    33    43      57    57    49    57
Scenario (V)
(20,100)     5.5   5.5   5.9   9.8      25    25    20    27      30    30    26    32
(20,200)     4.9   5.9   5.8   9.7      27    27    18    28      35    35    26    35
(20,400)     4.6   5.6   5.6   6.8      25    25    21    27      32    32    26    34
(20,800)     5.7   5.7   4.9   7.6      27    27    19    28      36    36    26    37
(30,100)     4.2   4.2   5.8   8.4      36    36    33    39      50    49    45    51
(30,200)     5.9   5.9   6.2   8.3      37    37    33    38      50    50    44    49
(30,400)     4.5   4.5   5.0   7.1      40    40    32    40      54    54    50    55
(30,800)     4.1   5.1   4.7   7.1      40    40    32    41      55    55    47    55
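The limit used at the end of the proof of Lemma 1 is easy to check numerically. The sketch below (hypothetical function names, standard library only) evaluates the Gamma-function ratio $\Gamma^2((p+1)/2)/\{\Gamma(p/2)\Gamma((p+2)/2)\}$ on the log scale to avoid overflow, together with the Stirling-type product $(1-p^{-1})^{p/2}\{1+(p-2)^{-1}\}^{(p-2)/2}$; both should approach one as $p$ grows.

```python
import math

def gamma_ratio(p):
    """Gamma((p+1)/2)^2 / {Gamma(p/2) * Gamma((p+2)/2)}, computed via log-gamma."""
    log_ratio = (2.0 * math.lgamma((p + 1) / 2.0)
                 - math.lgamma(p / 2.0)
                 - math.lgamma((p + 2) / 2.0))
    return math.exp(log_ratio)

def stirling_product(p):
    """(1 - 1/p)^(p/2) * (1 + 1/(p-2))^((p-2)/2), defined for p > 2."""
    return ((1.0 - 1.0 / p) ** (p / 2.0)
            * (1.0 + 1.0 / (p - 2)) ** ((p - 2) / 2.0))
```

By log-convexity of the Gamma function the ratio is always below one, and it increases towards one with $p$, with a gap of roughly $1/(2p)$.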


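Lemma 2 For any matrix $M$, we have $E(u_i^TMu_i)^2 = O(p^{-2}\mathrm{tr}(M^TM)) + O(p^{-2}\mathrm{tr}^2(M))$, $i = 1,\dots,n$.

Proof. Define $M = (a_{lk})_{l,k=1}^p$ and $u_i = (u_{i1},\dots,u_{ip})^T$. Since expectations of products in which some coordinate of $u_i$ appears an odd number of times vanish by symmetry,

$$E(u_i^TMu_i)^2 = \sum_{l,k=1}^p\sum_{s,t=1}^p a_{lk}a_{st}E(u_{il}u_{ik}u_{is}u_{it}) = \sum_{k=1}^p\sum_{l\neq k}(a_{lk}^2 + a_{lk}a_{kl})E(u_{il}^2u_{ik}^2) + \sum_{k=1}^p\sum_{l=1}^p a_{ll}a_{kk}E(u_{il}^2u_{ik}^2).$$

Because $E(u_{il}^4) = O(p^{-2})$, $E(u_{il}^2u_{ik}^2) = O(p^{-2})$ and

$$\sum_{k=1}^p\sum_{l=1}^p a_{lk}^2 = \mathrm{tr}(M^TM), \qquad \sum_{k=1}^p\sum_{l=1}^p a_{ll}a_{kk} = \mathrm{tr}^2(M),$$

we have $E(u_i^TMu_i)^2 = O(p^{-2}\mathrm{tr}(M^TM)) + O(p^{-2}\mathrm{tr}^2(M))$. □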


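Lemma 3 As $n\to\infty$ and $p\to\infty$, $Q_S'/\sigma_0 \to N(0,1)$ in distribution, where $Q_S' = \frac{p}{n(n-1)}\sum_{i\neq j}(v_i^Tv_j)^2 - 1$ and $v_i = u_i/\sqrt{\tau_p}$.

Proof. Define $v_i = u_i/\sqrt{\tau_p}$. Thus, $E(v_iv_i^T) = p^{-1}I_p$. Define

$$Q_S' = \frac{p}{n(n-1)}\sum_{i\neq j}(v_i^Tv_j)^2 - 1 = \frac{p}{n(n-1)}\sum_{i\neq j}\{(v_i^Tv_j)^2 - p^{-1}\}.$$

The expectation of $Q_S'$ can be easily verified and is thus omitted here. $\mathrm{var}(Q_S')$ can be computed as follows:

$$\begin{aligned}
\mathrm{var}(Q_S') &= \{n(n-1)\}^{-2}p^2E\Big[\Big\{\sum_{i\neq j}(v_i^Tv_j)^2\Big\}^2\Big] - 1\\
&= \{n(n-1)\}^{-2}p^2\big[2n(n-1)E(v_i^Tv_j)^4 + 4n(n-1)(n-2)E\{(v_i^Tv_j)^2(v_i^Tv_k)^2\}\\
&\qquad + n(n-1)(n-2)(n-3)E\{(v_i^Tv_j)^2(v_k^Tv_l)^2\}\big] - 1\\
&= 4(p-1)/\{n(n-1)(p+2)\}.
\end{aligned}$$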
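The last equality in the variance computation can be verified by hand. The derivation below is a sketch under the assumption, made here only for illustration, that the $v_i$ behave like independent vectors uniformly distributed on the unit sphere, so that $E(v_i^Tv_j)^4 = 3/\{p(p+2)\}$ and $E\{(v_i^Tv_j)^2(v_i^Tv_k)^2\} = E\{(v_i^Tv_j)^2(v_k^Tv_l)^2\} = p^{-2}$ for distinct indices.

```latex
\begin{align*}
\mathrm{var}(Q_S')
 &= \frac{p^2}{\{n(n-1)\}^2}\Big[\frac{6n(n-1)}{p(p+2)}
      +\frac{4n(n-1)(n-2)}{p^2}
      +\frac{n(n-1)(n-2)(n-3)}{p^2}\Big]-1\\
 &= \frac{1}{n(n-1)}\Big[\frac{6p}{p+2}+(n-2)(n+1)\Big]-1\\
 &= \frac{1}{n(n-1)}\Big[\frac{6p}{p+2}-2\Big]
  = \frac{4(p-1)}{n(n-1)(p+2)} .
\end{align*}
```

The second line uses $4(n-2)+(n-2)(n-3) = (n-2)(n+1)$, and the third uses $(n-2)(n+1) = n(n-1)-2$.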


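Next, we only need to show the asymptotic normality of $Q_S'$. Let $\mathcal{F}_0 = \{\emptyset,\Omega\}$, $\mathcal{F}_k = \sigma\{v_1,\dots,v_k\}$, $k = 1,\dots,n$. Let $E_k(\cdot)$ denote the conditional expectation given $\mathcal{F}_k$ and $E_0(\cdot) = E(\cdot)$. Write $Q_S' - E(Q_S') = \sum_{k=1}^nG_{n,k}$, where $G_{n,k} = (E_k - E_{k-1})Q_S'$. Then for every $n$, $\{G_{n,k}\}_{k=1}^n$ is a martingale difference sequence with respect to the $\sigma$-fields $\{\mathcal{F}_k, 1\le k\le n\}$. Let $\sigma_{n,k}^2 = E_{k-1}(G_{n,k}^2)$. According to the martingale central limit theorem (Hall and Heyde 1980), we only need to show that, as $n\to\infty$,

$$\frac{\sum_{k=1}^n\sigma_{n,k}^2}{\mathrm{var}(Q_S')} \to 1 \ \text{in probability and}\ \frac{\sum_{k=1}^nE(G_{n,k}^4)}{\mathrm{var}^2(Q_S')} \to 0. \qquad (2)$$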

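Define $\Gamma_{k-1} = \sum_{i=1}^{k-1}(v_iv_i^T - p^{-1}I_p)$. We have

$$\sum_{k=1}^n\sigma_{n,k}^2 = \sum_{k=1}^nE_{k-1}(G_{n,k}^2) = \sum_{k=1}^n4\{n(n-1)\}^{-2}p^2E_{k-1}\{(v_k^T\Gamma_{k-1}v_k)^2\} = \frac{8}{\{n(n-1)\}^2}\sum_{k=1}^n\mathrm{tr}(\Gamma_{k-1}^2)\{1+o(1)\}.$$

By noting that

$$\sum_{k=1}^n\mathrm{tr}(\Gamma_{k-1}^2) = \sum_{k=1}^n\sum_{i=1}^{k-1}\sum_{j=1}^{k-1}\mathrm{tr}\{(v_iv_i^T - p^{-1}I_p)(v_jv_j^T - p^{-1}I_p)\} = \frac{n(n-1)(p-1)}{2p} + \sum_{i\neq j}\{n - \max(i,j)\}\,\mathrm{tr}\{(v_iv_i^T - p^{-1}I_p)(v_jv_j^T - p^{-1}I_p)\},$$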


we can obtain 


^n,k 


,k=l 


4(p - 1) 

n{n — l)p 


, var 


E 

^ k=l 


(J. 


n,k 


128(n-2)(p- 1) 
3{n(n — 1)Yp‘^{p + 2)' 


Clearly, $\sum_{k=1}^n\sigma_{n,k}^2/\mathrm{var}(Q_S') \to 1$ in probability.

Finally, we verify the second part of (2). Note that


E 


16p^ 


k=l 


n{n — 1) 


E{vl {vivj -p %)vky 


{n{n-l)Y 

+ n{n-l){n-2)E^ {vl {viv'^ - p-%) v ^)^ {vl{vjvJ - p-% 


Because 


e\W 


Tr T 
VaV, 


T/ T 
^ViV, 


-p ^Ip)t;fc}^ = 0{p ^), 

p-%)vky {vK 


T 


p ^Ip)r;fc}^ = 0{p 


it is straightforward to see that $\sum_{k=1}^nE(G_{n,k}^4) = o\{\mathrm{var}^2(Q_S')\}$. This completes the proof of the lemma. □


Appendix B: Proof of Theorems 
Proof of Theorem 1 We decompose $U_{ij}$ as

$$U_{ij} = U(X_i - X_j) = E\{U(X_i - X_j)\mid X_i\} - E\{U(X_j - X_i)\mid X_j\} + \omega_{ij}.$$

Under $H_0$, $E\{U(X_i - X_j)\mid X_j\} = -u_j$. Then, $U_{ij} = u_i - u_j + \omega_{ij}$. Obviously, $E(\omega_{ij}) = 0$,

According to Lemmas 1 and 3, we have

$$J_1/\sigma_0 \to N(0,1) \ \text{in distribution}.$$

Thus, we only need to show that the other parts are all $o_p(\sigma_0)$.

E{JI) =0{p^n~‘^)E{uJujuJukuluiuf Ui) + 0{p^n~^)E{uJujuJukulujuJui) 
=0{p~^n~‘^) + 0{p~^n~^) = o{al), 

E{Jl) =0{p^n-*)E{{uJujuluif) = 0(p"V"^) = o(a^). 

Finally, we only consider the first part of $J_4$. The proofs of the other parts are similar.

E(0{pn-^) 

are not equal 

=0{p‘^n~^)E{uJujuJoL>kiu^UjU^OL>ki) + 0{p‘^n~'^)E{{uJujuJoL>kif) 
=0{p-^n-^)E{ujliU}ki) + 0{p-^n-^)E{ujliUJki) 

=o{p~^n~^) + o{p~^n~^) = o((To). 

Here we complete the proof. □ 

Proof of Theorem 2 Define $V_i = E\{U(X_i - X_j)\mid X_i\}$. By arguments similar to those for Theorem 1, we can show that

Now, write Vi = {A}J‘^Ui\/{1 + and then 

E(VfV,f = tr ([B {A'Pu,u,^A'P(l + 

=tr [{B {Afu.u,^Af) }"] + tr ([B {C,Al,/^-u,u,^ Af }]") , 
where $C_i$ is a bounded random variable between

21 


-1 and — (1 + uJ'Dn,pUi 


\-2 


Obviously, 


tr 

and Lemma 2, 




= Tpp Hr^Ap) = Tpp ^(p+tr(D^ p)). By the Cauchy inequality 


tr (^[E 

<CtT (Ay^'Ui'Ui'^Ay^)^l E ^^[uJl)n,pUi) 

»2 




<Cp tr(Ap)tr(D„p) = Op {p + tr(D„p)}tr(D„p) = o(p n 


by the condition tr(D^p) = 0{n ^p). Consequently, E{Q'^) = ptr(Ap) — 1 + o(n ^). Taking 
the same procedure as EKVfVj)"^}, we can obtain that 


EiVjVj)^ = {3tr2(A2) + 6tr(Aj)}/{p(p + 2)(p + 4)(p + 6)}[l + 0{p 2tr(D2_^)}], 
E{{VjV,r{VfV,r} = {tr2(A2) + 2tr(Aj)}/{p3(p + 2)}[l + 0{p-2tr(Dy}]. 


And then. 


var 


n{n — 1) 






4tr2(A2) 8{ptr(A^)-tr2(A2)} 


\_n{n — l)p^ 


+ 


(n — l)p"^ 


{1 + 0 ( 1 )}. 


Thus, 


E{Qs) = tT{Dlp)/p +o{n ^), 

4tr^(Ap) 8{tr(Ap) - p“Hr^(Ap)} 


var(( 5 s) = 


n{n — l)p^ 


(n — l)p^ 


{1 + 0 ( 1 )}. 


It suffices to show that $T_n = \{n(n-1)\}^{-1}\sum_{i\neq j}(V_i^TV_j)^2$ is asymptotically normal. Obviously,


var^(T„) > K max 


{tr(AO-p-Hr2(AO}tr2(AO tr^(A2) 


n{n — l)^p^ 


’ {n{n — l)}^p^ 


for sufficiently large $n$, where $K$ is some constant.

We again use the martingale central limit theorem (Hall and Heyde 1980) to prove the asymptotic normality. For this purpose, let $\mathcal{F}_0 = \{\emptyset,\Omega\}$, $\mathcal{F}_k = \sigma\{V_1,\dots,V_k\}$, $k = 1,\dots,n$. Let $E_k(\cdot)$ denote the conditional expectation given $\mathcal{F}_k$ and $E_0(\cdot) = E(\cdot)$. Write $T_n - E(T_n) = \sum_{k=1}^nG_{n,k}$, where $G_{n,k} = (E_k - E_{k-1})T_n$. Then for every $n$, $\{G_{n,k}\}_{k=1}^n$ is a martingale difference sequence with respect to the $\sigma$-fields $\{\mathcal{F}_k, 1\le k\le n\}$. Let $\sigma_{n,k}^2 = E_{k-1}(G_{n,k}^2)$. It suffices to show that, as $n\to\infty$,

$$\frac{\sum_{k=1}^n\sigma_{n,k}^2}{\mathrm{var}(T_n)} \to 1 \ \text{in probability and}\ \frac{\sum_{k=1}^nE(G_{n,k}^4)}{\mathrm{var}^2(T_n)} \to 0. \qquad (1)$$

As $E(\sum_{k=1}^n\sigma_{n,k}^2) = \mathrm{var}(T_n)$, to see the first part of (1), we only need to show $\mathrm{var}(\sum_{k=1}^n\sigma_{n,k}^2) = o\{\mathrm{var}^2(T_n)\}$. Define $2E(V_iV_i^T) = \Gamma_p$ and $\Gamma_{k-1} = \sum_{i=1}^{k-1}(2V_iV_i^T - \Gamma_p)$. By the same procedure as for $E\{(V_i^TV_j)^2\}$,

<k =Ek-i{Glk) 

_■ 8p2 {tr(rfc_iAp)2tr2(Ap) - tr2(rfc_iAp)tr(Ap} 

{n{n — l)y tr'^(Ap) 

IQp^ {tr(rfc_iA^)tr^(Ap) - tr(rfc_iAp)tr2(Ap)} 
v?{n — l) tr^(Ap) 


8 p Mtr(A^)-p-Hr^(A^)} ' 

v? tr^(Ap) 


[l + o{p ^(D^ p)}] . 


Then 


O'n^k ~ {El,n + R2,n + Rz,n + Ri,n + Rb,n + C'){1 + o(l)}, 


k=l 


where (T is a constant, and 
Rl,n 
R2,n 
R^,n 

RijU ~ 


32p^ tr^(A^) ELi(fc - l)(Er=i ApV.) 

{n(n —1)}2 tr^(Ap) 

32p2 ELi(fc-l)(Eti^fA3F,) 


Rb,n 


{n{n — 

1)}^ tr3(Ap) 

32p2 

(ELi El'i v^Klvy 

n^(n — 1) 

tr3(Ap) 

32p2 

tr^(A^)(ELiEl">fApV.) 

n^(n — 

1) tr5(Ap) 

32p2 

ELiEtiEK(^fApV,)2 


{n{n — l)Y tr^(Ap) 

It suffices to show $\mathrm{var}(R_{i,n}) = o\{\mathrm{var}^2(T_n)\}$ for $i = 1,\dots,6$. Using

'k-l 


var 




k=l 


, 2=1 


E 

2=1 

n 

E 

2=1 


(n — + i — ly 


E{VjK,V,f - {E(Vjt.,V,)Y 


;n-i)^(n + i- 1)^] {tr(A*)-p ‘tr^(Ag)} 


7 


4p2 


{ 1 + 0 ( 1 )}, 
we have


var(i?i,n) ^ tr^(A^) 
var 2 (T„) “ tr^(Ap) 


By carrying out similar procedures we can show that $\mathrm{var}(R_{i,n}) = o\{\mathrm{var}^2(T_n)\}$ for $i = 1,\dots,6$, and hence complete the proof for the first part of (1).

To show the second part of (1),


5 ^ £«,) <}^e1^2VIT^V, - tr(rj) j 


k=l 


128p^ 


{n(n — 1)}^ 


J^EUvlrk-iVk-tTiTk-irA 

k=l ^ ' 


By some algebra, we get 


E 


2VlTpVk 


tr(rj) 


^ tr(A^) {tr(A4) - p Hr2(A2)} 

- 


which leads to 



< K 


fa(A^) 

tr 2 (A 2 )' 


By the Cauchy inequality, tr(D^p) < tr2(D2 p) and tr2(D^p) < tr(D^p)tr(D2 p), so tr(A^) = 
o(p2) = o(tr2(Ap)) by the condition tr(D2p) = 0{n~^p). Thus, k — 

1 ^ 

tr(rp > = o(var2(T„)). Similarly, we can get 


128p^ 


{n{n — 1 )}^ 


'^E[2VlTk-iVk-ir{Tk-iTp)\ = o{ 

k=l ^ ' 


var 2 (T„)}. 


Here we complete the proof for the second part of (1). □

Proof of Theorem 3 Under $H_0$, similar to $Q_S$, we decompose $Q_K$ as follows:

$$Q_K = \frac{p}{n(n-1)(n-2)(n-3)}\sum_{i,j,k,l\ \text{are not equal}}(U_{ij}^TU_{kl})^2 - 1 = \frac{p}{n(n-1)(n-2)(n-3)}\sum_{i,j,k,l\ \text{are not equal}}\{(u_i - u_j + \omega_{ij})^T(u_k - u_l + \omega_{kl})\}^2 - 1$$
i,j,k,l are not equal 

4p n2 .2 


n{n — 1) 


+ 






i¥=j 


n{n — l){n — 2) ^ i 

i,j,k are not equal 


p 


n{n — l)(n — 2){n — 3) 




ij,k,l are not equal 

+ 0{pn~^) uiu,u; i^Jjk + 0{pn EEEE ujukuju^ki 

i,j^k are not equal hj^k,l are not equal 

+ 0{pn~^) 

i,j,k,l are not equal 

+ 0{pn~^) J2J2Y1 '(l-2rF)) 

i,j,k are not equal 

+ 0(pn“^) EEEE«“U«t-(i-2^ft) 

i,j,k,l are not equal 


According to the proof of Theorem 1, we only need to show that the last two parts are $o_p(\sigma_0)$.
E(0{pn-^) -P~\1-2 tf))^ 

i,j,k are not equal 

=0{p‘^n-^)E{{{uJu)jkY - P~\^ - 

+ 0{p‘^n-^)E {{{ujujjkf - P~\l - ^tf)) {{ufujjkY-p~\l - 2 tf))) 

=0{p\-^) {EiiuJujjkY) -p~^{l - 2TFf) 

+0{p^n~‘^) [E{{uJujjkf{uJujjkf) - 2TFf) 

=o{n-^) + oin-"^) = o{al), 

E (o{pn~‘^) EEEE«“U«t-(i-2 ^ft)) 

\ hjyk,l are not equal / 

- (1 - 2 rF)^) + - (1 - 2TFf) 


Thus, we have proved result (i). Similarly, we can also prove result (ii) under $H_1$. □


Appendix C: Proof of Corollaries
Proof of Corollary 1 From Theorems 1-2,


U 1 - limsup 4 I 


lim inf pr 


Obviously, $\sigma_0/\sigma_1 = O(1)$ due to $\mathrm{tr}(\Lambda_p^4) - p^{-1}\mathrm{tr}^2(\Lambda_p^2) \ge 0$. Denote


Tin — 


8{tr(A4)-p Hr 2 (Ap} 


^ 8 {tr(A^)tr^(Ap) + tr=^(Ap - 2tr(Ap)tr(Aptr(A3)} 

tr2(A2)p2 

Firstly, consider the case p/tr(D^p) = o(l). The condition ntr(D^p)/p —)■ oo leads to 

_ d _= o|_d_UolAALl 

\nHr‘‘{Dipj \ntr 2 (DJj)J 

1 , 


which implies the assertion of Corollary 1. For the case p/tr(D^p) = 0(1), it can be seen that 
l 2 nhin = 0(1). By Theorem 4-(i) in Chen et ah (2010), we have 72 n/{j^P“^tr^(D^p)} —)■ 0 
from which the corollary follows immediately. □ 

Proof of Corollary 2 By Theorem 1 in Chen et al. (2010),

C„-tr(D 7 )/p 


/4n-2 + 72„n- 


^A( 0 , 1 ) 


in distribution, where $C_n$ is the test statistic proposed by Chen et al. (2010). Thus, the power function of $C_n$ is




^An 2 72„n" 


0 (Dn,p)/P 

^4n“2 + 72nn“ 


According to Theorems 1 and 2, the power function of $Q_S$ is


Obviously, $\sigma_0 = 2n^{-1}(1+o(1))$ as $p\to\infty$. Then, the asymptotic relative efficiency of $Q_S$ with respect to $C_n$ is one in this case. □

Proof of Corollary 3 According to the proof of Theorem 3 (ii), $Q_K = Q_S + o_p(\sigma_1)$. Thus, by Corollaries 1 and 2, we can easily obtain the results. □